DETAILED ACTION
Claim Rejections - 35 USC § 103 

1. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness
rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

2. Claims 1-3, 7-10, 14, 16-18, 22-26, and 30 are rejected under 35 U.S.C. 103(a) as being
unpatentable over US 6,847,365 B1 to Miller et al. in view of US 4,725,973 to Matsuura et al.

a. Regarding claims 1, 16, and 24, Miller et al. discloses configuring a plurality of 
processing elements (MPE0, MPE1... Fig. 3) into a two-dimensional array of processing
elements (...media processor 32 comprises a communication bus 60, a main bus 62 and a 
supplemental (supp) bus 64, all of which are used to connect various processing 
elements and sub-system units of processor 32.... processor 32 preferably comprises a 
first processing element (MPE0) 66, a second processing element (MPE1)... col. 5, lines
50-60; ...although four MPEs are shown, the invention is not limited to any particular
number of MPEs and may have as few as one MPE or a large number of them... col. 7,
lines 3-10) such that each processing element includes a plurality of vector registers 
(...for example, to store a vector value that requires 128 bits, four of the 32-bit physical 
registers are logically combined together. Thus, the first defined logical vector register, 
V0 may be stored in physical registers R0-R3... col. 14, lines 25-45), a plurality of scalar
registers (...so the lower half of each scalar register 174-180... similarly, a pixel data type
182 also may comprise four scalar registers 184, 186, 188 and 190... col. 13, lines 30-40;
Fig. 6), and a plurality of arithmetic logic units (...each MPE 66-72 preferably comprises
a single instruction stream, multiple data stream (SIMD) internal hardware architecture,
and a Very Long Instruction Word (VLIW) architecture. The VLIW architecture includes
a plurality of parallel processing units, such as an arithmetic logic unit, a multiplier unit, 
etc; so that each processing unit may execute its own instruction on its own data. Thus 
media processor 32, as discussed above, comprises a parallel architecture, and each of 
the MPEs 66-72 within media processor 32 also preferably comprises of parallel 
architecture... col. 10, lines 1-13; ...architecture 100 of the MPEs may have a plurality of
sub-units, such as an execution control unit (ECU) 106... a register control unit (RCU)
110, an arithmetic logic unit (ALU) 112, a multiplication processing unit (MUL) 114, and
a register file 116... col. 10, lines 27-31) wherein a data path of each processing element
includes a set of processing element slices each coupled to one arithmetic logic unit
(...ECU 106, MEM 108, RCU 110, ALU 112 and MUL 114 all are connected together in
parallel via register file 116... col. 10, lines 32-35) such that each arithmetic logic unit
(ALU 112) receives a specified portion of each vector register as input (...ALU 112
performs arithmetic and logical operations on data that typically is stored in register file
116, and also may be optimized for pixel operations... col. 11, lines 14-25; ...Fig. 6, a
diagram illustrating the various data types that are supported... by each MPE 66-72... the
simplest data type is a scalar type 160... for example, a vector data type 162 may comprise
four 32-bit scalar data types... col. 13, lines 22-51; ...For example, R0-R3 may be
combined to store a first pixel, ...and R12-R15 may store a vector... col. 14, lines 24-45;
...ALU 112 is flexible because the inputs and outputs of ALU 112 may be directed to
and from a variety of different sources... col. 16, lines 47-67; ...Fig. 10 is a diagram
showing how an MUL unit 214 processes a pixel or small vector data type... col. 17, lines
57-67); configuring a video stream into data blocks (...the invention is directed to a novel
processing architecture, that can decompress and process video data... the compressed 
media stream, that may include audio and visual data, enters a media processing system 



Application/Control Number: 10/815,329 Page 4 

Art Unit: 2671 

30... col. 3, lines 33-64; ...the operation of processor 32 will be explained by way of an
example of processing an MPEG video data stream... col. 6, lines 49-67); loading the data
blocks into the plurality of vector registers within each processing element (...As 
shown,... pixel maps are 16 bits, and may be stored in DRAM... loaded into MPE 
registers... Pixel map type 4 has 32 bits per pixel... this type of pixel map may be stored in
a vector register... col. 16, lines 10-35); processing the read portions by the arithmetic
logic units such that the data blocks from the plurality of vector registers are processed 
in parallel (...this particular parallel bus structure permits an increased amount of data 
to be processed by media processing system 30... col. 6, lines 22-30). Miller et al.
implicitly discloses a plurality of block registers which function similarly to vector
registers (Specification of the instant application at paragraph 17 discloses vector or 
block registers holding data blocks). However, Miller et al. fails to disclose reading the 
specified portions of each vector register by each of the corresponding arithmetic logic 
units within all processing elements simultaneously. Matsuura et al. discloses 
concurrent execution of the vector instruction by a plurality of arithmetic logic units 
(...and these two element series are simultaneously read from the vector register array 
and are concurrently processed by two arithmetic logic units; then the results obtained 
from these two arithmetic logic units are simultaneously written in the vector register
array... col. 3, lines 13-28). Therefore, it would have been obvious to a person of ordinary
skill in the art at the time the invention was made to modify the device as taught by Miller et
al. with the feature "simultaneous read or write operations by arithmetic logic units" as
taught by Matsuura et al. because it provides for improved processing time of data.

b. Regarding claims 2 and 17, Miller et al. discloses wherein loading the data
blocks into the plurality of vector registers, reading the specified portions of each vector 
register, processing the read portions, and writing the results of the processing is 
performed within one processing element clock cycle (...thus, an entire pixel or small
vector may be multiplied by a second entire pixel or small vector in a single clock cycle
using MUL unit 114, thus increasing the processing speed of the media processing
system for graphics applications that use pixels... col. 18, lines 1-10).

c. Regarding claims 3, 10, 18, and 26, Miller et al. discloses wherein each 
arithmetic logic unit is further configured to receive the contents of one of the scalar 
registers as input (...ALU 112 performs arithmetic and logical operations on data that 
typically is stored in register file 116... col. 11, lines 14-25).

d. Regarding claims 7, 14, 22, and 30, Miller et al. discloses register file 116 being 
used to temporarily store data that is being processed by any of processing units 106-114, 
and functions as block registers of the instant claim (...Register file 116 may be used to 
temporarily store data that is being processed by... processing units 106-114 in the MPE. 
For example, when two numbers are added together, each number may be loaded into 
registers in register file 116, and then the registers are added together with the result 
being stored in a register... register file 116 may be reconfigured to store various different
types of data, including video processing specific data types, such as pixels... col. 10, lines 
45-53). 

e. Regarding claims 8 and 23, Miller et al. discloses ALU unit 112 returning data
back to register file 116 (...As shown in Fig. 4, ALU unit 112 and MUL unit 114 both can 
return data back to register file 116 via return register file ports 126 and 128 respectively, 
so that the result data may be stored in one of the register file registers... col. 11, lines 22- 
25). 

f. Regarding claim 9, Miller et al. discloses a main memory (element 46, Fig. 2); 
and the rest of the instant claim limitation is similar to claim 1 above and is rejected 
under the same rationale. 

g. Regarding claim 25, it is similar in scope to claim 9 above and is rejected under
the same rationale. 

3. Claims 4, 11, 19, and 27 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
US 6,847,365 B1 to Miller et al. in view of US 4,725,973 to Matsuura et al. as applied to claim 1
above, and further in view of US 5,226,171 to Hall et al. 

a. Regarding claims 4, 11, 19, and 27, the Miller-Matsuura combination fails to
disclose wherein when the contents of one of the scalar registers is used as input, a
scalar register value is broadcast from the one scalar register to all arithmetic logic units
within a given processing element. Hall et al. discloses column elements of a matrix
M being broadcast to all the arithmetic units (...the arithmetic unit also includes a first 
set of four vector registers 48 and a first set of scalar registers 50 which selectively 
receive data input from the bus means 40... col. 4, lines 9-15;... the same column elements 
of a matrix M (at 36 in Fig. 3) are broadcast to all the arithmetic units employing 
synchronous memory bus means... col. 4, lines 59-67; ...as hereinbefore indicated, a
common input operand may be broadcast to a plurality of the arithmetic units 16-30,
with each one performing calculations with respect to data theretofore stored
therein... col. 5, lines 44-49). Therefore, it would have been obvious to a person of
ordinary skill in the art at the time the invention was made to modify the device as taught by
the Miller-Matsuura combination with the feature "scalar register value being broadcast to
all arithmetic logic units" as taught by Hall et al. because this allows calculations of
substantial size to be performed, resulting in faster processing (col. 5, lines 1-2).

4. Claims 15 and 31 are rejected under 35 U.S.C. 103(a) as being unpatentable over US 
6,847,365 B1 to Miller et al. in view of US 4,725,973 to Matsuura et al. as applied to claim 9
above, and further in view of "A bit-serial VLSI array processing chip for image processing" 
Heaton, R.; Blevins, D.; Davis, E.; Solid-State Circuits, IEEE Journal of Volume 25, Issue 2,
April 1990 Page(s): 364 - 368 Digital Object Identifier 10.1109/4.52157. 

a. Regarding claims 15 and 31, the Miller-Matsuura combination does not disclose
wherein each processing element comprises a mask register to exclude processing of the
processing element. "A bit-serial VLSI array processing chip for image processing" 
Heaton, R.; Blevins, D.; Davis, E.; Solid-State Circuits, IEEE Journal of Volume 25, 
Issue 2, April 1990 Page(s): 364 - 368 discloses that when a "maskable" instruction is
issued, all PE's whose G bits are set to zero will be masked and will not perform the
broadcast operation. All other PE's, with their G registers set, will execute the instruction
normally (see Fig. 7). Using this approach, PE's can be selectively enabled and disabled.
Therefore, it would have been obvious to a person of ordinary skill in the art at the time
the invention was made to modify the device as taught by the Miller-Matsuura combination with
the feature "selective enabling and disabling of PE's" as taught by "A bit-serial VLSI
array processing chip for image processing" Heaton, R.; Blevins, D.; Davis, E.
because it provides for efficient use of PE's resulting in efficient PE operations.

Allowable Subject Matter 
5. Claims 5, 6, 12, 13, 20, 21, 28, and 29 are objected to as being dependent upon a rejected 
base claim, but would be allowable if rewritten in independent form including all of the 
limitations of the base claim and any intervening claims. Prior art fails to disclose wherein each 
processing element further comprises a local accumulation register to accumulate the results of 
the plurality of arithmetic logic units within the processing element, further wherein each 
processing slice is coupled to a slice accumulator such that each slice accumulator buffers the 
results from any one of the local accumulation registers corresponding to the processing 
elements of the processing slice; and further comprising accumulating the results of each slice 
accumulator into a global accumulator register. US 6,601,077 B1 to Aldrich et al. discloses a
DSP unit for multi-level global accumulation (...the DSP 100 includes... an arithmetic logic unit
(ALU) 165, and an accumulator 175... col. 1, lines 63-67) but fails to disclose a slice accumulator
buffering the results from any one of the local accumulation registers corresponding to the 
processing elements of the processing slice and accumulating the results of each slice 
accumulator into a global accumulator register. 

Conclusion 

6. The following prior art made of record but not relied upon is pertinent to the invention. 
US 6,366,998 B1 to Mohamed (Reconfigurable Functional Units for implementing a
hybrid VLIW-SIMD programming model); US 6,820,102 B2 to Aldrich et al. (DSP unit 
for multi-level global accumulation); US 5,729,758 to Inoue et al. (SIMD processor 
operating with a plurality of parallel processing...); US 4,633,389 to Tanaka et al. (Vector 
processor system comprised of plural vector processors); US 6,138,136 to Bauer et al. 
(Signal processor); US 5,019,969 to Izumisawa et al. (Computer system for directly 
transferring vector elements from register to register using a single instruction); US 
6,809,422 B2 to Saito et al. (One-chip image processing apparatus). 

7. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Dalip K. Singh whose telephone number is (571) 272-7792. 
The examiner can normally be reached on Mon-Friday (10:00AM-6:30PM).

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Ulka Chauhan, can be reached at (571) 272-7782. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov . Should you have questions on access to the Private 
PAIR system, please contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Please note that the new Central Official FAX number for application specific communications 
with the USPTO is 571-273-8300 (effective July 15, 2005). 

Dalip K. Singh 
Examiner, Art Unit 2671 

dks 

March 14, 2006 



ULKA CHAUHAN 
SUPERVISORY PATENT EXAMINER 




